BEYOND CHANCE-CONSTRAINED CONVEX MIXED-INTEGER
OPTIMIZATION: A GENERALIZED CALAFIORE-CAMPI ALGORITHM
AND THE NOTION OF S-OPTIMIZATION.

J. A. DE LOERA, R. N. LA HAYE, D. OLIVEROS, AND E. ROLDAN-PENSADO 


Abstract. The scenario approach developed by Calafiore and Campi to attack chance- 
constrained convex programs (i.e., optimization problems with convex constraints that are 
parametrized by an uncertainty parameter) utilizes random sampling on the uncertainty 
parameter to substitute the original problem with a representative continuous convex 
optimization with N convex constraints which is a relaxation of the original. Calafiore 
and Campi provided an explicit estimate on the size N of the sampling relaxation to yield 
high-likelihood feasible solutions of the chance-constrained problem. They measured the 
probability of the original constraints to be violated by the random optimal solution from 
the relaxation of size N. 

This paper has two main contributions. First, we present a generalization of the
Calafiore-Campi results to both integer and mixed-integer variables. In fact, we demonstrate
that their sampling estimates work naturally for variables that take on even more
sophisticated values restricted to some subset $S$ of $\mathbb{R}^d$. In this way, a sampling or scenario
algorithm for chance-constrained convex mixed-integer optimization is just a
very special case of a stronger sampling result in convex analysis. The key elements,
necessary for all the proofs, are generalizations of Helly’s theorem where the convex sets
are required to intersect $S \subset \mathbb{R}^d$. The size of samples in both algorithms will be directly
determined by the $S$-Helly numbers.

Motivated by the first half of the paper, for any subset $S \subset \mathbb{R}^d$, we introduce the
notion of an $S$-optimization problem, where the variables take on values over $S$. It
generalizes continuous ($S = \mathbb{R}^d$), integer ($S = \mathbb{Z}^d$), and mixed-integer optimization
($S = \mathbb{R}^k \times \mathbb{Z}^{d-k}$). We illustrate with examples the expressive power of $S$-optimization to capture
sophisticated combinatorial optimization problems with difficult modular constraints. We
reinforce the evidence that $S$-optimization is “the right concept” by showing that the
well-known randomized sampling algorithm of K. Clarkson for low-dimensional convex
optimization problems can be extended to work with variables taking values over $S$.


1. Introduction 

Chance-constrained optimization is a branch of stochastic optimization concerning problems
in which constraints are imprecisely known but the problems need to be solved with
a minimum probability of reliability or certainty. Such problems arise quite naturally in
many areas of finance (e.g., portfolio planning where losses should not exceed some risk

Key words and phrases. Chance-constrained optimization, Convex mixed-integer optimization,
Optimization with restricted variable values, Randomized sampling algorithms, Helly-type theorems,
S-optimization.





CHANCE-CONSTRAINED S-CONVEX OPTIMIZATION 




threshold) [Ml 22], telecommunications (services agreements where contracts require network
providers to guarantee with high probability that packet losses will not exceed a
certain percentage) [2TH2T] . and facility location (for medical emergency response stations, 
while requiring high probability of coverage over all possible emergency scenarios) mm- 
Chance-constrained problems are notoriously difficult to solve because the feasible region 
is often not convex and because the probabilities can be hard to compute exactly. For 
information on how to solve such chance-constrained problems and how to deal with
probabilistic uncertain optimization see [8l fT0ll25l[T3l 2011231128] and the excellent references 
therein. 

We have two main contributions: 


Sampling in Chance-constrained Convex Mixed Integer Optimization and beyond.
Our main result is a generalization of the scenario approximation method of Calafiore
and Campi mm for continuous variables and convex constraints. Here we generalize their 
sampling algorithm for integer and mixed-integer variables. To state the result we need 
the following notions. Let $\Omega$ be a probability space. Let $f(x,w) : (\mathbb{Z}^{d-k} \times \mathbb{R}^k) \times \Omega \to \mathbb{R}$
be a convex function on $x \in \mathbb{Z}^{d-k} \times \mathbb{R}^k$ and measurable on $w \in \Omega$. This function $f$ can
be thought of as representing constraints on $\mathbb{Z}^{d-k} \times \mathbb{R}^k$, one for each value of $w$. Note
that “x violates the constraint” is a random event. The probability of violation of a vector
$x \in \mathbb{Z}^{d-k} \times \mathbb{R}^k$ is defined as $V(x) = \Pr[\{w \in \Omega : f(x,w) > 0\}]$. We seek a solution $x$ with
small associated value for V(x), because it means it is feasible for “most” of the problem 
instances. We also hope for our conclusion to hold with high confidence, or equivalently 
we wish to have small amount of distrust for the prediction. 

Corollary 1.1. Let $f$ and $\Omega$ be given as above. Let $0 < \varepsilon < 1$ (tolerance for violation),
$0 < \delta < 1$ (distrust or lack of confidence) be chosen parameters. Suppose further that there
is an optimal value $x^*$ of the linear minimization chance-constrained mixed-integer convex
problem

$$\begin{array}{ll}
\min & c^T x \\
\text{subject to} & V(x) \le \varepsilon, \\
 & x \in K \text{ convex set}, \\
 & x \in \mathbb{Z}^{d-k} \times \mathbb{R}^k.
\end{array}$$

Then from a sufficiently large-size random sample of $N$ different i.i.d. values for $w$
(specifically, $w^1, w^2, \dots, w^N$), $x^*$ can be $\delta$-approximated by the random variable $x_N$, the
optimal solution of the convex mixed-integer optimization problem

$$\begin{array}{ll}
\min & c^T x \\
\text{subject to} & f(x, w^i) \le 0, \quad i = 1, 2, \dots, N, \\
 & x \in K \text{ convex set}, \\
 & x \in \mathbb{Z}^{d-k} \times \mathbb{R}^k.
\end{array}$$

More precisely, if $x_N$ exists and the sample has size

$$N \ge \frac{2\,(2^{d-k}(k+1) - 1)}{\varepsilon} \ln(1/\varepsilon) + \frac{2}{\varepsilon} \ln(1/\delta) + 2\,(2^{d-k}(k+1) - 1),$$

then the undesirable event of high-infeasibility $V(x_N) > \varepsilon$ has probability
less than $\delta$ of occurring.

Note that when $k = 0$, we are in the situation of chance-constrained integer convex
optimization, which is a special case. In fact, Corollary 1.1 follows from a more general
result. But before we can state it, we need one important definition from convex analysis.

Definition 1.2. For a nonempty family $\mathcal{K}$ of sets, the Helly number $h = h(\mathcal{K}) \in \mathbb{N}$ of $\mathcal{K}$ is
defined as the smallest number satisfying the following:

$$\forall\, i_1, \dots, i_h \in [m] : F_{i_1} \cap \cdots \cap F_{i_h} \neq \emptyset \;\Longrightarrow\; F_1 \cap \cdots \cap F_m \neq \emptyset$$

for all $m \in \mathbb{N}$ and $F_1, \dots, F_m \in \mathcal{K}$. If no such $h$ exists, then $h(\mathcal{K}) := \infty$.

E.g., for the classical Helly’s theorem that appears in all books in convexity, $\mathcal{K}$ is the
family of all convex subsets of $\mathbb{R}^d$.

For $S \subseteq \mathbb{R}^d$ we define

$$h(S) := h(\{S \cap K : K \subseteq \mathbb{R}^d \text{ is convex}\}).$$

That is, $h(S)$ is the Helly number when the sets are required to intersect at points in $S$;
we will call this the $S$-Helly number.

For instance, when $S$ is finite then the bound $h(S) \le |S|$ is trivial. The original Helly
number is $h(\mathbb{R}^d) = d + 1$ and, interestingly, if $F$ is any subfield of $\mathbb{R}$ (e.g., $\mathbb{Q}(\sqrt{2})$), then
Radon’s proof of Helly’s theorem directly shows that the $S$-Helly number of $S = F^d$ is still
$d + 1$. Doignon’s theorem [T3] (later rediscovered in [6l [T9l [M] l) states that a finite family of
convex sets in $\mathbb{R}^d$ intersects at a point of $\mathbb{Z}^d$ if every $2^d$ members of the family intersect
at a point of $\mathbb{Z}^d$. Another example is the work of A. J. Hoffman in m and Averkov and
Weismantel [3], who gave a mixed version of Helly’s and Doignon’s theorems which includes
them both. This time the intersection of the convex sets is required to be mixed-integer,
with variables taking values in $\mathbb{Z}^{d-k} \times \mathbb{R}^k$, and this can be guaranteed if every $2^{d-k}(k+1)$
sets intersect in such a point.
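
As a small sanity check of why the number $2^d$ in Doignon’s theorem cannot be lowered, the following sketch (not taken from the paper; all helper names are hypothetical) builds, for $d = 2$, the four triangles obtained by deleting one corner of the unit square. It verifies by brute force that every three of them share a lattice point while all four together do not, so the Helly number of $\mathbb{Z}^2$ is at least 4.

```python
from itertools import combinations, product

CORNERS = [(0, 0), (1, 0), (1, 1), (0, 1)]  # unit square, listed counterclockwise

def cross(o, a, b):
    """Twice the signed area of the triangle (o, a, b)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def lattice_points_in_triangle(tri, box=range(-2, 4)):
    """Integer points of a closed triangle, found by scanning a small box."""
    a, b, c = tri
    if cross(a, b, c) < 0:          # enforce counterclockwise orientation
        b, c = c, b
    return {
        p for p in product(box, repeat=2)
        if cross(a, b, p) >= 0 and cross(b, c, p) >= 0 and cross(c, a, p) >= 0
    }

# One convex set per deleted corner: the triangle spanned by the other three.
sets = [lattice_points_in_triangle([c for c in CORNERS if c != v]) for v in CORNERS]

every_three_meet = all(set.intersection(*trip) for trip in combinations(sets, 3))
all_four_meet = bool(set.intersection(*sets))
print(every_three_meet, all_four_meet)
```

The same construction on the $2^d$ corners of the unit cube gives the analogous lower bound in dimension $d$.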

The $S$-Helly number $h(S)$ is relevant for our purposes as it is a measure of the feasibility
of a system of convex constraints over $S$. Essentially, if the system is $S$-infeasible, then
there must be a subsystem of size $h(S)$ or less that is infeasible. E.g., from the classical
Helly’s theorem one derives that a (real) infeasible system of convex constraints in $d$ variables
contains a subset of no more than $d + 1$ constraints that certifies that the entire
set has empty intersection (no common solution). It is fair to say that applications in
optimization have prompted many papers about Helly numbers

la ei eh nans* In 

mm the usual Helly number $d + 1$ played a role for the size of a support set; if we are
interested in solutions with values in $S$, we can use the $S$-Helly number to predict the size
of a support set and recover the sample size.

We state here our most general theorem (for full details see Section 2).

Theorem 1.3. Let $S \subseteq \mathbb{R}^d$ be a set with a finite Helly number $h(S)$. Let $0 < \varepsilon < 1$
(tolerance), $0 < \delta < 1$ (distrust) be chosen parameters. Let $f(x, w)$ be a convex function
in $x$ and measurable in $w$. Suppose there is an optimal value $x^*$ of the linear minimization
chance-constrained problem

$$\begin{array}{ll}
CCP(\varepsilon) = \min & c^T x \\
\text{subject to} & \Pr[f(x, w) > 0] \le \varepsilon, \\
 & x \in K \text{ convex set}, \\
 & x \in S.
\end{array}$$

Then from a sufficiently large random sample of $N$ different i.i.d. values for $w$ (specifically,
$w^1, w^2, \dots, w^N$), $x^*$ can be $\delta$-approximated by $x_N$, the optimal solution of the convex
optimization problem

$$\begin{array}{ll}
SCP(N) = \min & c^T x \\
\text{subject to} & f(x, w^i) \le 0, \quad i = 1, 2, \dots, N, \\
 & x \in K \text{ convex set}, \\
 & x \in S.
\end{array}$$

More precisely, if $x_N$ exists and the size of the sample satisfies

$$N \ge \frac{2\,(h(S) - 1)}{\varepsilon} \ln(1/\varepsilon) + \frac{2}{\varepsilon} \ln(1/\delta) + 2\,(h(S) - 1),$$

then the undesirable event of high-infeasibility $V(x_N) > \varepsilon$ has probability less
than $\delta$ of occurring.
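
To get a feeling for the magnitudes involved, the sample-size bound of Theorem 1.3 can be evaluated numerically. The sketch below is only an illustration: the function names are hypothetical, and the mixed-integer Helly number $2^{d-k}(k+1)$ is the value quoted in the text for $\mathbb{Z}^{d-k} \times \mathbb{R}^k$.

```python
import math

def helly_mixed_integer(d, k):
    """Helly number 2^(d-k) * (k+1) of Z^(d-k) x R^k, as quoted in the text."""
    return 2 ** (d - k) * (k + 1)

def scenario_sample_size(helly, eps, delta):
    """Smallest integer N satisfying the bound of Theorem 1.3 as stated."""
    h = helly - 1
    bound = (2 * h / eps) * math.log(1 / eps) + (2 / eps) * math.log(1 / delta) + 2 * h
    return math.ceil(bound)

# (d, k) = (2, 2): continuous plane; (2, 0): pure integer; (4, 2): mixed-integer.
for d, k in [(2, 2), (2, 0), (4, 2)]:
    hS = helly_mixed_integer(d, k)
    print(d, k, hS, scenario_sample_size(hS, eps=0.1, delta=0.01))
```

The bound grows linearly in $h(S)$, so the exponential dependence of the integer Helly number on the number of discrete variables is what dominates the sample size.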

Indeed, taking $S = \mathbb{Z}^{d-k} \times \mathbb{R}^k$, and because its Helly number is $h(S) = 2^{d-k}(k+1)$, we
obtain as an immediate consequence the result for chance-constrained convex mixed-integer
optimization stated in Corollary 1.1. Moreover, we can provide the following guarantee of
the quality of the solution:

Theorem 1.4. Let $0 < \varepsilon < 1$ (tolerance), $0 < \delta < 1$ (confidence), and $N$ sufficiently large
(as in Theorem 1.3; note that this depends on $S$). Let $J_\varepsilon$ be the optimal objective value
of $CCP(\varepsilon)$ and $J_N$ be the optimal objective value of $SCP(N)$ (note that $J_N$ is a random
variable).

(1) Suppose $CCP(\varepsilon)$ is feasible. Then with probability of at least $1 - \delta$, if $SCP(N)$ is
feasible, it holds that $J_\varepsilon \le J_N$.

(2) Define $\varepsilon_1 = 1 - (1 - \delta)^{1/N}$. With probability at least $1 - \delta$, we have $J_N \le J_{\varepsilon_1}$.
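
The level $\varepsilon_1$ in part (2) is much smaller than a typical tolerance $\varepsilon$. The short sketch below (the function name is hypothetical) evaluates $\varepsilon_1 = 1 - (1 - \delta)^{1/N}$ and checks the identity $(1 - \varepsilon_1)^N = 1 - \delta$ that makes the argument work.

```python
def eps_one(delta, N):
    """The level eps_1 = 1 - (1 - delta)^(1/N) from Theorem 1.4 (2)."""
    return 1.0 - (1.0 - delta) ** (1.0 / N)

delta, N = 0.01, 237
e1 = eps_one(delta, N)
print(e1, (1.0 - e1) ** N)
```

For these values $\varepsilon_1$ is close to $\delta / N$, which shows that the upper bound in (2) compares $J_N$ with a far more conservative chance-constrained problem than $CCP(\varepsilon)$.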
S-optimization and Clarkson’s Algorithm. The essential arguments used in the Calafiore-
Campi scenario method apply to more complicated variable values over S, well beyond the 
reals or the integers. The proofs are also the same. This motivated us to introduce the 
notion of S-optimization, a natural generalization of continuous, integer, and mixed-integer 
optimization: 

Definition 1.5. Given $S \subseteq \mathbb{R}^d$, the optimization problem with equations, inequalities and
variables taking values on $S$,

$$\begin{array}{ll}
\max & f(x) \\
\text{subject to} & g_i(x) \le 0, \quad i = 1, 2, \dots, n, \\
 & h_j(x) = 0, \quad j = 1, 2, \dots, m, \\
 & x \in S,
\end{array}$$

will be called an $S$-optimization problem.

Clearly when $S = \mathbb{R}^d$ the $S$-optimization problem is the usual continuous optimization
problem, $S = \mathbb{Z}^d$ is just integer optimization, and $S = \mathbb{Z}^{d-k} \times \mathbb{R}^k$ is the case of
mixed-integer optimization. When only linear constraints are present this is an $S$-linear program.
When all constraints are convex we call this an $S$-convex program. This paper presents
two algorithmic results about $S$-convex programs.

But, why study $S$-optimization? Or, rather, why do it for an unfamiliar set $S$? As we
show below, $S$-optimization problems have natural expressive power, sometimes using fewer
or simpler constraints than standard continuous or mixed-integer optimization.

Here are two more unusual examples of $S$-optimization problems. This time we model
succinctly with more sophisticated $S \subset \mathbb{R}^d$ (typically discrete sets).

Example 1.6. Given a graph $G = (V, E)$, we reformulate the classic graph $K$-coloring
query as the solvability of the following linear system of modular inequations: For all
$(i, j) \in E(G)$ consider the inequations $c_i \not\equiv c_j \bmod K$. This is a system on $|V|$ variables
and it has a solution if and only if the graph is $K$-colorable. Note that the set of points
$c = (c_1, \dots, c_{|V|})$ with $c_i \equiv c_j \bmod K$ is a lattice, which we call $L_{i,j}$. Therefore, solving
our system of inequations is equivalent to finding a $c \in S = \mathbb{Z}^{|V|} \setminus \bigcup_{(i,j) \in E} L_{i,j}$. Consequently,
the problem of deciding $K$-colorability is equivalent to the problem of finding a solution to
an $S$-linear system of equations, where the variables take values on $S$, the set difference of
a lattice and a union of several sublattices.
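
The membership test for this set $S$ is easy to write down. The following sketch (function names are hypothetical, and the exhaustive search is only meant for tiny graphs) checks whether a point of $\mathbb{Z}^{|V|}$ avoids every lattice $L_{i,j}$, and looks for such a point among one representative of each residue class modulo $K$.

```python
from itertools import product

def in_S(c, edges, K):
    """True if c lies in Z^|V| minus the union of L_ij = {c : c_i = c_j mod K}."""
    return all((c[i] - c[j]) % K != 0 for i, j in edges)

def find_coloring(num_vertices, edges, K):
    """Search the box {0, ..., K-1}^|V|; membership in S only depends on residues mod K."""
    for c in product(range(K), repeat=num_vertices):
        if in_S(c, edges, K):
            return c
    return None

cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(find_coloring(5, cycle5, 2))  # an odd cycle has no point of S for K = 2
print(find_coloring(5, cycle5, 3))
```

Restricting the search to residues is legitimate because $S$ is invariant under translation by $K\mathbb{Z}^{|V|}$.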

Example 1.7. Here is another instance of $S$-optimization which has ancestors in [18].
We are interested in the solutions of the following modular mixed-integer optimization
problem:

$$\begin{array}{ll}
\min & 3x_1 + 7x_2 + 4x_3 + \sum_{i \ge 4}^{N} (100 - i)\, x_i \\
\text{subject to} & 8x_1 + 3x_2 + 5x_3 \equiv 6 \bmod 11, \\
 & 6x_1 + 4x_2 - 3x_3 \equiv 1 \bmod 2, \\
 & x_1 \not\equiv x_3 \bmod 5, \\
 & x_1 + x_2 + x_3 + \sum_{i \ge 6}^{N} x_i \le 1000, \\
 & x_1 \not\equiv 2, 4, 16 \bmod 23, \quad x_2 \equiv 0 \bmod 2, \quad x_3 \equiv 2 \bmod 3, \\
 & x_1, x_2, x_3 \ge 0 \text{ and integral}, \\
 & x_i, \; i = 4, \dots, N \text{ continuous}.
\end{array}$$

Note that only the integral variables have modular restrictions. By adding integer slacks,
we can reformulate this problem as a problem with only six integral variables (with modular
restrictions) and $N - 3$ continuous variables.

$$\begin{array}{ll}
\min & 3x_1 + 7x_2 + 4x_3 + \sum_{i \ge 4}^{N} (100 - i)\, x_i \\
\text{subject to} & 8x_1 + 3x_2 + 5x_3 = y_1, \\
 & 6x_1 + 4x_2 - 3x_3 = y_2, \\
 & x_1 - x_3 = y_3, \\
 & x_1 + x_2 + x_3 + \sum_{i \ge 6}^{N} x_i \le 1000, \\
 & x_1 \not\equiv 2, 4, 16 \bmod 23, \quad x_2 \equiv 0 \bmod 2, \quad x_3 \equiv 2 \bmod 3, \\
 & y_1 \equiv 6 \bmod 11, \quad y_2 \equiv 1 \bmod 2, \quad y_3 \not\equiv 0 \bmod 5, \\
 & x_1, x_2, x_3 \ge 0 \text{ and integral}, \\
 & x_i, \; i = 4, \dots, N \text{ continuous}.
\end{array}$$

What is the set $S \subseteq \mathbb{R}^{N+3}$ where the variables take on values for this situation? The answer
can be described first as $S_1 \times S_2 \times S_3 \times S_4 \times S_5 \times S_6 \times \mathbb{R}^{N-3}$, where each $S_i$ can be described as
the difference between $\mathbb{Z}$ and a union of cosets (or translated sublattices) with respect
to lattices of multiples of an integer $q$. Thus at the end $S_1 \times S_2 \times S_3 \times S_4 \times S_5 \times S_6$
can be written as the lattice $\mathbb{Z}^6$ from which we subtract the union of several translated
sublattices.
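
The modular restrictions of this example translate directly into a membership test. The sketch below is only illustrative: the function names and the search box are hypothetical, the continuous variables are ignored (as if fixed to zero, which also makes the budget constraint trivially satisfied inside the box), and only the integer part $3x_1 + 7x_2 + 4x_3$ of the objective is minimized by brute force.

```python
from itertools import product

def in_S_integer_part(x1, x2, x3):
    """Modular restrictions of Example 1.7 on (x1, x2, x3) and the slacks y1, y2, y3."""
    y1 = 8 * x1 + 3 * x2 + 5 * x3
    y2 = 6 * x1 + 4 * x2 - 3 * x3
    y3 = x1 - x3
    return (
        min(x1, x2, x3) >= 0
        and x1 % 23 not in (2, 4, 16)
        and x2 % 2 == 0
        and x3 % 3 == 2
        and y1 % 11 == 6
        and y2 % 2 == 1          # Python's % is non-negative for a positive modulus
        and y3 % 5 != 0
    )

def cheapest_integer_triple(bound=30):
    """Brute-force minimum of 3*x1 + 7*x2 + 4*x3 over feasible triples in [0, bound]^3."""
    feasible = [
        (3 * x1 + 7 * x2 + 4 * x3, (x1, x2, x3))
        for x1, x2, x3 in product(range(bound + 1), repeat=3)
        if in_S_integer_part(x1, x2, x3)
    ]
    return min(feasible) if feasible else None

print(in_S_integer_part(1, 2, 5))
print(cheapest_integer_triple())
```

For instance, $(x_1, x_2, x_3) = (1, 2, 5)$ gives $y_1 = 39 \equiv 6 \bmod 11$, $y_2 = -1 \equiv 1 \bmod 2$ and $y_3 = -4 \not\equiv 0 \bmod 5$, so it belongs to the integer part of $S$.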

We must remark that the Helly number for the difference of a lattice and a union of
its sublattices has been estimated in m- To fully stress that $S$-optimization is the right
notion we present a second algorithm which works well for $S$-convex programs.


In ED- K. Clarkson introduced a family of algorithms which, relying on repeated calls
to an oracle that optimizes small-size subsystems, iteratively sample from the original
(large) optimization problem until they reach a global optimum. The expected runtime is
linear in the number of input constraints. Our key observation is that the same Clarkson
ideas are applicable to $S$-convex optimization problems, with a sampling size given
by the Helly number $h(S)$ of the variable domain. Clarkson, in his ground-breaking work,
already applied his ideas to traditional linear and integer linear optimization because he
had the $S$-Helly numbers of $S = \mathbb{R}^d$ and $S = \mathbb{Z}^d$. However, he did not explicitly invoke the
concept of the $S$-Helly number. By doing this, we now present a direct generalization of his
algorithms. Our proof that Clarkson’s algorithm extends relies on the theory of violator
spaces Ell-

Theorem 1.8. Let $S \subseteq \mathbb{R}^d$ be a closed set with a finite Helly number $h(S)$. Using Clarkson’s
algorithm, one can find a solution of the $S$-convex optimization problem

$$\begin{array}{ll}
\min & c^T x \\
\text{subject to} & f_i(x) \le 0, \quad f_i \text{ convex for all } i = 1, 2, \dots, m, \\
 & x \in S,
\end{array}$$

in an expected $O\big(h(S)\, m + h(S)^{O(h(S))}\big)$ calls to an oracle that solves smaller subsystems
of the system above of size $O(h(S))$. Thus, when a violation primitive oracle runs in
polynomial time and $h(S)$ is small, Clarkson’s algorithm runs in expected linear time in
the number of constraints.

2. A Calafiore-Campi Style Algorithm for Chance-Constrained Convex S-Optimization

We begin with some formal preliminaries. In all that follows, let $S$ be a proper subset of
$\mathbb{R}^d$ and let $\Omega$ be a probability space. Suppose we have a function $f(x, w) : S \times \Omega \to \mathbb{R}$ which
is convex on $x \in S$ and measurable on $w$ (chance variables that represent stochasticity).
This parametric function $f$ can be thought of as representing one constraint on $S$ for each
value of $w$: given $x \in S$, $x$ satisfies the constraint if $f(x, w) \le 0$ and violates the constraint
otherwise. Note that “x violates the constraint” is a random event. We have the following
definition:

Definition 2.1 ([S. !]]). Let $x \in S$ be given. The probability of violation of $x$ is defined as

$$V(x) = \Pr[\{w \in \Omega : f(x, w) > 0\}].$$

For example, if we take the uniform probability density (with respect to Lebesgue’s
measure), then $V(x)$ is just the volume of those parameters $w$ for which $f(x, w) \le 0$ is
violated. We seek a solution $x$ with small associated value for $V(x)$, because it means it is
feasible for “most” of the problem instances. When a vector $x \in S$ has a small probability
of violation V(x), x is said to be approximately feasible (this notion was first studied in 
Let $\varepsilon \in [0, 1]$ represent the tolerance for violation. For $x \in S$, if we have
$\Pr[f(x, w) \le 0] \ge 1 - \varepsilon$, we say $x$ is an $\varepsilon$-level feasible solution. In other words, $\varepsilon$-level solutions are those
with $V(x) \le \varepsilon$. Moreover, we would like to be confident that high probability of violation
is unlikely to occur among the constraints, so we will use a distrust parameter $\delta \in [0, 1]$.
When $\delta$ is small, it represents high confidence on the prediction. Our goal is to find $x^*$
such that

$$\Pr[\{V(x^*) > \varepsilon\}] \le \delta.$$
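
When $V(x)$ cannot be computed in closed form it can at least be estimated by simulation. The sketch below is a minimal illustration under stated assumptions: the constraint family $f(x, w) = w_1 x_1 + w_2 x_2 - 1$ and the uniform distribution of $w$ on $[-1, 1]^2$ are hypothetical choices made only for this example, as are the function names.

```python
import random

def f(x, w):
    """Hypothetical constraint family, linear (hence convex) in x."""
    return w[0] * x[0] + w[1] * x[1] - 1.0

def sample_w(rng):
    """Hypothetical uncertainty: w uniform on the square [-1, 1]^2."""
    return (rng.uniform(-1, 1), rng.uniform(-1, 1))

def estimate_violation(x, trials=20000, seed=0):
    """Monte Carlo estimate of V(x) = Pr[f(x, w) > 0]."""
    rng = random.Random(seed)
    hits = sum(f(x, sample_w(rng)) > 0 for _ in range(trials))
    return hits / trials

print(estimate_violation((0, 0)))   # f = -1 for every w, so no violation is ever observed
print(estimate_violation((2, 0)))   # exact value is Pr[w_1 > 1/2] = 0.25
```

With $\varepsilon = 0.1$, the point $(0, 0)$ is an $\varepsilon$-level feasible solution for this toy family while $(2, 0)$ is not.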

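Given a linear cost function $x \mapsto c^T x$ and the tolerance $\varepsilon$, a natural problem is that of
minimizing $c^T x$ over $\{x \in S \mid V(x) \le \varepsilon\}$. This is a chance-constrained $S$-convex optimization
problem $CCP(\varepsilon)$:

$$\begin{array}{ll}
CCP(\varepsilon) = \min & c^T x \\
\text{subject to} & V(x) \le \varepsilon, \\
 & x \in K \text{ convex set}, \\
 & x \in S.
\end{array} \tag{2.1}$$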


The key idea to solve $CCP(\varepsilon)$ is to create a similar, but easier, problem. We may sample
$N$ i.i.d. values $w^1, w^2, \dots, w^N$ from $\Omega$. This gives us the sampled convex program

$$\begin{array}{ll}
SCP(N) = \min & c^T x \\
\text{subject to} & f(x, w^i) \le 0, \quad i = 1, 2, \dots, N, \\
 & x \in K \text{ convex set}, \\
 & x \in S.
\end{array} \tag{2.2}$$

Denote by $x_N$ the (uniquely selected) optimal solution of the problem (2.2). Note that $x_N$ is a
random variable, so $\{V(x_N) > \varepsilon\}$ is an event with some probability. In fact, $V(x_N)$ is a random
variable in the space $\Omega^N$ with product probability measure $\Pr \times \Pr \times \cdots \times \Pr = \Pr^N$.
We claim that for large enough $N$, we have $\Pr[\{V(x_N) > \varepsilon\}] \le \delta$.
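
The sampled program can be illustrated on a toy instance. In the sketch below everything concrete is an assumption made for the example: $S = \mathbb{Z}^2$, $K = [-5, 5]^2$, the constraint family $f(x, w) = w_1 x_1 + w_2 x_2 - 1$ with $w$ uniform on $[0, 1)^2$, the cost vector $c = (-1, -1)$, and a brute-force enumeration standing in for a real mixed-integer solver. The sample size $N = 237$ is the value given by the bound of Theorem 1.3 for $h(S) = 4$, $\varepsilon = 0.1$ and $\delta = 0.01$.

```python
import random
from itertools import product

def f(x, w):
    """Hypothetical constraint family, linear (hence convex) in x."""
    return w[0] * x[0] + w[1] * x[1] - 1.0

def solve_scp(samples, c=(-1, -1), box=range(-5, 6)):
    """Brute-force SCP(N) over K ∩ S with K = [-5, 5]^2 and S = Z^2."""
    feasible = [x for x in product(box, repeat=2) if all(f(x, w) <= 0 for w in samples)]
    if not feasible:
        return None
    # Ties in the objective are broken lexicographically: a uniquely selected optimum.
    return min(feasible, key=lambda x: (c[0] * x[0] + c[1] * x[1], x))

rng = random.Random(1)
N = 237
samples = [(rng.random(), rng.random()) for _ in range(N)]
x_N = solve_scp(samples)

# Estimate V(x_N) on fresh, independent samples of w.
fresh = [(rng.random(), rng.random()) for _ in range(20000)]
violation = sum(f(x_N, w) > 0 for w in fresh) / len(fresh)
print(x_N, violation)
```

The printed estimate of $V(x_N)$ can then be compared with the tolerance $\varepsilon = 0.1$.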

The key point of the sampling algorithms is to show that this is satisfied for sufficiently
large sample size $N$. For $S = \mathbb{R}^d$, Calafiore and Campi found what sufficiently large $N$ is
necessary in UM- Here we extend the scenario approximation scheme from El El to the
mixed-integer case, $S = \mathbb{Z}^k \times \mathbb{R}^{d-k}$. Note that this includes the pure integer case $S = \mathbb{Z}^d$
as well as other $S$.

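Before we start the proof of the main results we require a purely technical estimation:

Lemma 2.2. If

$$N \ge \frac{1}{1 - r} \left[ \frac{1}{\varepsilon} \ln\frac{1}{\delta} + h + \frac{h}{\varepsilon} \ln\frac{1}{r\varepsilon} + \frac{1}{\varepsilon} \ln\frac{(h/\mathrm{e})^h}{h!} \right],$$

then $\binom{N}{h} (1 - \varepsilon)^{N-h} \le \delta$.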
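Proof. Note that

$$N \ge \frac{1}{1 - r} \left[ \frac{1}{\varepsilon} \ln\frac{1}{\delta} + h + \frac{h}{\varepsilon} \ln\frac{1}{r\varepsilon} + \frac{1}{\varepsilon} \ln\frac{(h/\mathrm{e})^h}{h!} \right]$$

implies that

$$(1 - r) N \ge \frac{1}{\varepsilon} \ln\frac{1}{\delta} + h + \frac{h}{\varepsilon} \left( \ln\frac{h}{r\varepsilon} - 1 \right) - \frac{1}{\varepsilon} \ln(h!).$$

Thus

$$N \ge \frac{1}{\varepsilon} \ln\frac{1}{\delta} + h + \frac{h}{\varepsilon} \left( \ln\frac{h}{r\varepsilon} - 1 + \frac{r N \varepsilon}{h} \right) - \frac{1}{\varepsilon} \ln(h!).$$

But then, using the fact that $\ln(x) \ge 1 - \frac{1}{x}$ for positive values of $x$ and applying it to
$x = \frac{h}{r N \varepsilon}$, we obtain

$$N \ge \frac{1}{\varepsilon} \ln\frac{1}{\delta} + h + \frac{h}{\varepsilon} \ln(N) - \frac{1}{\varepsilon} \ln(h!).$$

From this last inequality one can bound the logarithm of $\delta$, in such a way that
$\ln(\delta) \ge -\varepsilon N + \varepsilon h + h \ln(N) - \ln(h!)$. Therefore, using the fact that $\mathrm{e}^{-\varepsilon(N-h)} \ge (1 - \varepsilon)^{N-h}$ (because
$-\varepsilon \ge \ln(1 - \varepsilon)$), we obtain

$$\frac{\delta}{(1 - \varepsilon)^{N-h}} \ge \frac{N^h}{h!} \ge \frac{N(N-1) \cdots (N-h+1)}{h!} = \binom{N}{h}.$$

This last inequality can be rewritten as $\delta \ge \binom{N}{h} (1 - \varepsilon)^{N-h}$, finishing the proof of the
statement. □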
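
The estimate of the lemma can be checked numerically. The sketch below (function names are hypothetical, and the threshold is the hypothesis of the lemma as it appears at the start of the proof) computes the smallest admissible integer $N$ for a few parameter choices with $r = 1/2$ and verifies that $\binom{N}{h}(1 - \varepsilon)^{N-h} \le \delta$.

```python
import math

def lemma_threshold(h, eps, delta, r=0.5):
    """Right-hand side of the hypothesis on N, for an integer h >= 1."""
    # (1/eps) * ln((h/e)^h / h!), written with lgamma(h + 1) = ln(h!)
    last = (h * (math.log(h) - 1) - math.lgamma(h + 1)) / eps
    inner = math.log(1 / delta) / eps + h + (h / eps) * math.log(1 / (r * eps)) + last
    return inner / (1 - r)

def tail(N, h, eps):
    """binom(N, h) * (1 - eps)^(N - h), the quantity bounded by delta."""
    return math.comb(N, h) * (1 - eps) ** (N - h)

cases = [(1, 0.2, 0.1), (3, 0.1, 0.01), (11, 0.05, 0.001)]
for h, eps, delta in cases:
    N = math.ceil(lemma_threshold(h, eps, delta))
    print(h, eps, delta, N, tail(N, h, eps) <= delta)
```

In these cases the left-hand side is far below $\delta$, which reflects how much slack the elementary inequalities used in the proof leave.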


We now present the proofs of Theorems 1.3 and 1.4. Recall we are concerned with the
linear minimization chance-constrained problem $CCP(\varepsilon)$. We assume that the problem
has an optimal solution.

Proof of Theorem 1.3. Suppose we have the sampling set $\{w^1, \dots, w^N\}$. Denote again by
$x_N$ the optimal solution for the auxiliary problem (2.2) obtained from the sampling. Note
that because $f$ is convex, each choice $w^i$ gives an $S$-convex set $K_i = \{x \in K : f(x, w^i) \le 0\}$.
The proof will require the use of the $S$-Helly number $h(S)$. The most important fact to
do the estimations is that if we have the optimum value of (2.2), the optimal solution is
defined by no more than $h(S) - 1$ of the $K_i$. This is because the sets $K_i$, $i = 1, \dots, N$, together
with $\{x : c^T x < c^T x_N\}$, form a family of convex sets with no common point in $S$. Thus, by the definition of
the $S$-Helly number, there is an infeasible subfamily with no more than $h(S)$ members; since the
$K_i$ alone have the common point $x_N$, this subfamily must contain the set $\{x : c^T x < c^T x_N\}$, which means that
from the original $K_i$ only $h(S) - 1$ participate. We call these $h(S) - 1$ subsets the witness
constraints of the problem (2.2).

Let $T_N$ be the set of all possible values that $N$ i.i.d. samples $(w^1, \dots, w^N)$ can take. Now
consider all possible index sets $I \subseteq [N] = \{1, \dots, N\}$ of cardinality $h(S) - 1$ and define

$$T_N^I = \{(w^1, \dots, w^N) \in T_N : (w^i)_{i \in I} \text{ defines the witness constraints of (2.2)}\}.$$
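Therefore, $T_N$ can be written as the union of the $T_N^I$ for all possible choices of $I$. Using
this we will bound the probability that $x_N$ is not in the solution set of (2.1). For simplicity,
let

$$R_\varepsilon = \{x \in K \cap S : \Pr[f(x, w) \le 0] \ge 1 - \varepsilon\} \quad \text{and} \quad G_\varepsilon = (K \cap S) \setminus R_\varepsilon,$$

and write $x_I$ for the optimal solution determined by the constraints indexed by $I$. Then

$$\begin{aligned}
\Pr\big[\{(w^1, \dots, w^N) \in T_N : x_N \in G_\varepsilon\}\big]
&\le \sum_{I \subseteq [N],\, |I| = h(S)-1} \Pr\big[\{(w^1, \dots, w^N) \in T_N^I : x_I \in G_\varepsilon\}\big] \\
&= \sum_{I \subseteq [N],\, |I| = h(S)-1} \Pr\big[\{(w^i)_{i \in I} : x_I \in G_\varepsilon\} \cap \{(w^j)_{j \notin I} : f(x_I, w^j) \le 0,\ j \notin I\}\big] \\
&= \sum_{I \subseteq [N],\, |I| = h(S)-1} \Pr\big[\{(w^i)_{i \in I} : x_I \in G_\varepsilon\}\big] \cdot \Pr\big[\{(w^j)_{j \notin I} : f(x_I, w^j) \le 0,\ j \notin I\} \,\big|\, \{(w^i)_{i \in I} : x_I \in G_\varepsilon\}\big] \\
&= \sum_{I \subseteq [N],\, |I| = h(S)-1} \Pr\big[\{(w^i)_{i \in I} : x_I \in G_\varepsilon\}\big] \cdot \prod_{j \notin I} \Pr\big[f(x_I, w^j) \le 0 \,\big|\, \{(w^i)_{i \in I} : x_I \in G_\varepsilon\}\big] \\
&\le \binom{N}{h(S) - 1} (1 - \varepsilon)^{N - (h(S) - 1)}.
\end{aligned}$$

The last inequality is true because, for the first type of factors, the probability is less than
or equal to 1, and for the second type (in the product) each of the factors has
probability no more than $1 - \varepsilon$. Therefore we wish to choose $N$ in such a way that we get
$\binom{N}{h(S)-1} (1 - \varepsilon)^{N - (h(S) - 1)} \le \delta$.

Finally, from Lemma 2.2 one can derive the bound stated in the theorem by two simple
observations: First, simply set $h = h(S) - 1$; second, the last term can be dropped because
it is not positive (this is the case since $n! \ge (n/\mathrm{e})^n$). In addition one can take $r$ to be any
value between 0 and 1. Thus, taking $r = 1/2$ one gets the statement of the theorem. □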


Proof of Theorem 1.4. The first claim is trivial: Since $N$ is sufficiently large, it follows
from Theorem 1.3 that with probability at least $1 - \delta$, the optimal solution $x_N$ of $SCP(N)$
is a feasible solution of $CCP(\varepsilon)$. Hence $J_\varepsilon \le c^T x_N = J_N$ with probability at least $1 - \delta$.

For the second claim, there are two cases: $CCP(\varepsilon_1)$ feasible and $CCP(\varepsilon_1)$ infeasible.
If $CCP(\varepsilon_1)$ is infeasible, then $J_N \le \infty = J_{\varepsilon_1}$. Suppose $CCP(\varepsilon_1)$ is feasible and consider
an arbitrary $x \in K \cap S$ which is feasible for $CCP(\varepsilon_1)$. That is, let $x \in K \cap S$ be such that
$\Pr[f(x, w) \le 0] \ge 1 - \varepsilon_1 = (1 - \delta)^{1/N}$. Since the $N$ samples in $SCP(N)$ are independently
chosen, the probability that $x$ is feasible for $SCP(N)$ is at least $\big((1 - \delta)^{1/N}\big)^N = 1 - \delta$.

Since $CCP(\varepsilon_1)$ is feasible, there is a sequence of vectors $\{x_i\}$ which are feasible for
$CCP(\varepsilon_1)$ such that $c^T x_i$ converges to $J_{\varepsilon_1}$. Because these vectors are feasible for $SCP(N)$
with probability at least $1 - \delta$, the probability that $J_N \le c^T x_i$ is at least $1 - \delta$ for any
$i \in \mathbb{N}$. It follows that $J_N \le J_{\varepsilon_1}$ with probability at least $1 - \delta$. □

To conclude it is important to mention that Luedtke and Ahmed [20] have also studied
chance-constrained optimization. Their results are related to ours in that they obtain
bounds on $N$ such that, with high probability, the solution to a sampled problem is feasible
in $CCP(\varepsilon)$. However, their constraints on $K$ and $f$ are different.
When $K \cap S$ is finite (e.g., purely integer variables), [20] showed that it is sufficient to take

$$N \ge \frac{1}{\varepsilon} \ln\left(\frac{1}{\delta}\right) + \frac{1}{\varepsilon} \ln(|K \cap S|).$$

This bound is better than our Theorem 1.3 as long as $h(S) > \ln(|K \cap S|)$ (e.g., integer
variables $S = \mathbb{Z}^d$ since $h(S) = 2^d$).

Luedtke and Ahmed also studied the mixed-integer case; for Lipschitz continuous $f$,
they showed that if

$$N \ge \frac{2}{\varepsilon} \ln\left(\frac{1}{\delta}\right) + \frac{2n}{\varepsilon} \ln\left\lceil \frac{2LD}{\gamma} \right\rceil + \frac{2}{\varepsilon} \ln\left\lceil \frac{2}{\varepsilon} \right\rceil,$$

where $L$ is the Lipschitz constant of $f$ and $D$ bounds the diameter of $K \cap S$, then if
$f(x, w^i) \le -\gamma$ for all $i \in [N]$, $x$ is feasible in $CCP(\varepsilon)$ with probability at least $1 - \delta$ (this
is similar to $SCP(N)$ but with an extra tolerance of $\gamma$). This result is similar to Luedtke
and Ahmed’s other result in that it depends only on the size of $K \cap S$ and not on the
structure of $S$. However, due to the tolerance constant $\gamma$, it does not say anything about
the feasibility of points on the boundary of $\{f(x, w^i) \le 0\}$ (such as the optimum).

3. A Clarkson-type sampling algorithm for S-convex optimization

To show that our introduction of $S$-optimization makes a lot of sense we present another
application besides Theorem 1.3. We show that a Clarkson-type algorithm can be used to
compute the optimal solutions to a given $S$-convex optimization problem. This is efficient
when the number of variables is constant. We consider again the solution of an $S$-optimization
problem with linear objective function and convex constraints

$$\begin{array}{ll}
\min & c^T x \\
\text{subject to} & f_i(x) \le 0, \quad f_i \text{ convex for all } i = 1, 2, \dots, m, \\
 & x \in S.
\end{array}$$

We demonstrate that a well-known algorithm due to Clarkson can be extended to
$S$-optimization as long as $S$ is closed, has a finite Helly number $h(S)$, and one has an
oracle to solve deterministic small-size subproblems. The method devised by Clarkson m
works particularly well for geometric optimization problems in few variables. Examples
of applications include convex and linear programming, integer linear programming, the
problem of computing the minimum-volume ball or ellipsoid enclosing a given point set in
$\mathbb{R}^n$, the problem of finding the distance of two convex polytopes in $\mathbb{R}^n$, and many others.
E.g., Clarkson stated the following result about linear programs and integer linear programs
(ILPs):

Theorem 3.1 (Clarkson). Given an $m \times n$ matrix $A$, a vector $b \in \mathbb{R}^m$ and the integer
program $\min\{c^T x : Ax \le b,\ x \in \mathbb{Z}^n,\ 0 \le x \le u\}$, one can find a solution to this problem in
an expected number of steps of order $O(n^2 m \log(m)) + n \log(m)\, O(n^{n/2})$. While the algorithm
is exponential, it gives the best complexity for solving ILPs when the number of variables $n$
is fixed.

Clarkson’s algorithm requires that many small-size subsystems of the original problem
be solved. This requires calls to an oracle that solves the small systems. The oracle
originally provided by Clarkson in the case of regular integer programming was Lenstra’s IP
algorithm in fixed dimension. As a consequence, when the number of variables is constant,
Clarkson’s algorithm gives a remarkable linear bound on the complexity (see recent work by
Eisenbrand H5]). Here we prove Theorem 1.8, which is a direct generalization of Clarkson’s
theorem for convex continuous and integral optimization.

The key idea is to use the theory of violator spaces introduced by Gärtner, Matoušek,
Rüst and Škovroň m- They showed it can be used as a general framework to work with
convex optimization problems. Essentially, a violator space is an abstract optimization
problem in which we have a finite set of constraints or elements $H$ and a function that,
given any subset of constraints $G$, indicates which other constraints in $H \setminus G$ violate the
feasible solutions to $G$. If one has a violator space structure, the optimal solution of the
problem can be computed via a randomized method whose running time is linear in the
number of constraints defining the problem, and subexponential in the dimension of the
problem. Violator spaces include all prior abstractions such as LP-type problems [2[ [26].
The key definition from m is the following:

Definition 3.2. A violator space is a pair $(H, V)$, where $H$ is a finite set and $V$ a mapping
$2^H \to 2^H$, such that the following two axioms hold:

Consistency: $G \cap V(G) = \emptyset$ holds for all $G \subseteq H$, and

Locality: $V(G) = V(F)$ holds for all $F \subseteq G \subseteq H$ such that $G \cap V(F) = \emptyset$.
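
Both axioms can be verified exhaustively on a small instance. The sketch below is a hedged illustration, not the paper’s construction verbatim: it assumes a finite grid as a stand-in for $S$, four hypothetical linear constraints that all hold at the origin (so every subsystem is feasible), and it takes $V(G)$ to be the set of constraints violated by the lexicographically tie-broken optimum of $G$.

```python
from itertools import combinations, product

GRID = list(product(range(-3, 4), repeat=2))          # finite stand-in for S, a box in Z^2
H = [(1, 2, 4), (2, -1, 3), (-1, 1, 2), (3, 1, 5)]    # hypothetical constraints a*x + b*y <= r
C = (-1, -1)                                          # objective: minimize c^T x

def satisfies(x, h):
    a, b, r = h
    return a * x[0] + b * x[1] <= r

def optimum(G):
    """Unique optimum over GRID subject to G (ties broken lexicographically)."""
    feasible = [x for x in GRID if all(satisfies(x, h) for h in G)]
    return min(feasible, key=lambda x: (C[0] * x[0] + C[1] * x[1], x))

def violators(G):
    """V(G): the constraints of H that the optimum of G fails to satisfy."""
    x_G = optimum(G)
    return frozenset(h for h in H if not satisfies(x_G, h))

subsets = [frozenset(s) for n in range(len(H) + 1) for s in combinations(H, n)]
consistency = all(not (G & violators(G)) for G in subsets)
locality = all(
    violators(F) == violators(G)
    for G in subsets for F in subsets
    if F <= G and not (G & violators(F))
)
print(consistency, locality)
```

Because the optimum is uniquely selected, a constraint is violated by the optimum of $G$ exactly when adding it to $G$ changes that optimum, so this map agrees with the usual "the optimum changes" description.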

There are three important ingredients of every violator space: a basis, the combinatorial 
dimension, and a primitive test (which will be answered by an oracle). First, as in the 
simplex method for linear programming the problem will be defined by bases, thus we need 
to have a notion of basis for our optimal solutions. 

Definition 3.3 (Gärtner et al. |TT] ). A basis of a violator space is defined in analogy to a
basis of a linear programming problem: a minimal set of constraints that defines a solution
space. Specifically, m Definition 7] defines $B \subseteq H$ to be a basis if $B \cap V(F) \neq \emptyset$ holds
for all proper subsets $F \subset B$. For $G \subseteq H$, a basis of $G$ is a minimal subset $B$ of $G$ with
$V(B) = V(G)$.

Moreover, violator space bases come with a natural combinatorial invariant, which is
strongly related to the Helly numbers we discussed earlier. The size of a largest basis of
a violator space $(H, V)$ is called the combinatorial dimension of the violator space and
denoted by $\delta = \delta(H, V)$.
The primitive test operation, used as a black box in all stages of the algorithm, is the
so-called violation test primitive. Given a violator space $(H, V)$, some set $G \subseteq H$, and
some element $h \in H \setminus G$, the primitive test decides whether $h \in V(G)$.

Gärtner et al. m proved a crucial property: knowing the violations $V(G)$ for all $G \subseteq H$
is enough to compute a largest basis. To do so, one can utilize Clarkson’s randomized
algorithm to compute a basis of some violator space $(H, V)$ with $m = |H|$.

The main idea to improve over a brute-force search is due to Clarkson CP¬
As described above, all one needs is to be able to answer the primitive query: Given
$G \subseteq H$ and $h \in H \setminus G$, decide whether $h \in V(G)$. Second, the runtime is given in terms of
the combinatorial dimension $\delta(H, V)$ and the size of the input set of constraints $H$. The
key result we will use in the rest of the paper is about the complexity of finding a basis:

Theorem 3.4. [TT| Theorem 27] Using Clarkson’s algorithms, a basis of $H$ in a violator
space $(H, V)$ can be found by answering the primitive query an expected $O\big(\delta |H| + \delta^{O(\delta)}\big)$
times.
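
To make the sampling idea concrete, here is a simplified sketch in the spirit of Clarkson’s iterative reweighting scheme. It is not the algorithm analyzed in the theorem: the sample size $6\delta^2$, the finite grid standing in for $S$, the random linear constraints (all satisfied at the origin), the brute-force oracle, and the fallback after a fixed number of rounds are all assumptions made for this illustration.

```python
import random
from itertools import product

GRID = list(product(range(-10, 11), repeat=2))    # finite stand-in for S (a box in Z^2)
C = (-1, -2)                                      # objective: minimize c^T x

def optimum(G):
    """Oracle: brute-force optimum of the subsystem G, ties broken lexicographically."""
    feasible = [x for x in GRID if all(a * x[0] + b * x[1] <= r for a, b, r in G)]
    return min(feasible, key=lambda x: (C[0] * x[0] + C[1] * x[1], x))

def clarkson_reweighting(H, delta, rng, max_rounds=500):
    """Sample by weight, solve the sample, double the weights of violated constraints."""
    weights = [1] * len(H)
    r = min(len(H), 6 * delta * delta)            # sample size, an assumption of this sketch
    for _ in range(max_rounds):
        R = set(rng.choices(range(len(H)), weights=weights, k=r))
        x = optimum([H[i] for i in R])
        # Primitive tests: which constraints of H does the sample optimum violate?
        V = [i for i, (a, b, rhs) in enumerate(H) if a * x[0] + b * x[1] > rhs]
        if not V:
            return x                              # feasible for H and optimal for R, hence for H
        if 3 * delta * sum(weights[i] for i in V) <= sum(weights):
            for i in V:
                weights[i] *= 2                   # violated constraints become more likely
    return optimum(H)                             # safety fallback, not part of the method

rng = random.Random(0)
H = [(rng.randint(-5, 5), rng.randint(-5, 5), rng.randint(0, 20)) for _ in range(300)]
x_star = clarkson_reweighting(H, delta=3, rng=rng)
print(x_star)
```

Whenever the loop returns, the answer is correct by construction: the sample optimum satisfies every constraint of $H$ and is optimal over a superset of the feasible region. The parameter `delta=3` corresponds to $h(\mathbb{Z}^2) - 1$.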

Proof of Theorem 1.8. Let $H = \{f_1, f_2, \dots, f_m\}$ be the constraints of the $S$-convex
optimization problem in the statement of Theorem 1.8. We define the violator set operator
$V(G)$ for a subset of inequalities $G \subseteq H$ as follows: We provide each $S$-program with a
universal tie-breaking rule, for instance, using lexicographic ordering. A constraint $h \in H$
is in $V(G)$ if the optimal solution of the subsystem $G$ with respect to the objective
function, denoted $x_G$, is not equal to the unique optimal solution of $G \cup \{h\}$, denoted
$x_{G \cup \{h\}}$. Note that we need to have a total ordering on the possible feasible solutions of $G$
and the fact that $S$ is closed to have a unique optimum.

For our proof we define the violator map as follows: a constraint $h \in H$ is in $V(G)$ if
the optimal solutions satisfy $x_G > x_{G \cup \{h\}}$. If $G$ has no feasible solutions,
we define $V(G)$ as being the empty set. Indeed any new constraint added to the integer
program can only decrease the number of feasible solutions. We need to check that the
two conditions presented in the definition of violator spaces are satisfied. The consistency
condition is clearly satisfied.

Assume now that $F \subseteq G \subseteq H$ and $G \cap V(F) = \emptyset$. To show locality we must verify
that $V(F) = V(G)$. Note that the hypothesis $G \cap V(F) = \emptyset$ means that $x_G = x_F$, because
otherwise at least one element in $G$ must be in $V(F)$.

Now we verify first the containment $V(F) \subseteq V(G)$. Take $h \in V(F)$; if $h \notin V(G)$ then
$x_{G \cup \{h\}} = x_G = x_F > x_{F \cup \{h\}}$. However, $F \cup \{h\} \subseteq G \cup \{h\}$. It follows that $x_{F \cup \{h\}} \ge x_{G \cup \{h\}}$
too, a contradiction. Now we check $V(G) \subseteq V(F)$. Take $h \in V(G)$; if $h \notin V(F)$, then
$x_{F \cup \{h\}} = x_F = x_G > x_{G \cup \{h\}}$. But then there exists $g \in G$ such that $g \in V(F \cup \{h\}) = V(F)$, a contradiction.

Since the two conditions of a violator space are satisfied, all that is left to apply Theorem
3.4 is to outline what the combinatorial dimension and the primitive test are. First, a basis
for $G$, using this violator space, represents an optimal solution of the $S$-subproblem. But
if we have the optimum value $x_G$, then the optimal solution is defined by no more than
$h(S) - 1$ of the $f_i$. This is because the constraints $f_i \in G$ together with $c^T x < c^T x_G$ define a family of
$S$-convex sets which has no common solution in $S$. Thus, by the definition of the $S$-Helly number, there is
an infeasible subfamily with no more than $h(S)$ members, one of which must be the set $\{x : c^T x < c^T x_G\}$;
this means that from the original $f_i \in G$
only $h(S) - 1$ participate. Therefore the combinatorial dimension of this violator space is
$h(S) - 1$. The primitive test is provided by an oracle that solves smaller problems of size
$O(h(S))$. Therefore, the conclusion of Theorem 1.8 follows by applying Theorem 3.4. □

4. Concluding Remarks 

We have shown that the quality guarantees of the sampling method of Calafiore and
Campi can be extended to more abstract convex optimization constraints. Clearly the
value of these results depends on having a practical algorithm to solve $SCP(N)$. Similarly,
in Clarkson’s method the query $h \in V(G)$ is answered via calls to the primitive as a black
box or oracle. The algorithms we derive are randomized but run in expected polynomial
time when the number of discrete variables is fixed. Moreover the algorithmic
complexity is in fact linear in the number of constraints, and it depends on calls to an oracle
that solves small-size subproblems. The size of these smaller subproblems is precisely the
$S$-Helly number.

In both cases, one requires an oracle to solve or test feasibility of a small-size $S$-convex
problem. Such oracles exist for $S = \mathbb{R}^d$, and for $S = \mathbb{Z}^d$ and $S = \mathbb{Z}^{d-k} \times \mathbb{R}^k$, as provided by
the usual deterministic algorithms for mixed-integer convex optimization. It is possible
to prove, using the results of pjj, that for $S$ equal to the difference of a lattice with the
union of finitely many of its sublattices, one can have such an algorithm when all the
constraints define a polyhedron of fixed dimension. In a forthcoming paper we will present
experiments that use the sampling bounds shown here to solve some of these problems.
The development of other such oracles will require the development of some interesting
mathematics.